{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Solutions\n",
    "\n",
    "## About the Data\n",
    "In this notebook, we will be working with 2018 stock data for Facebook, Apple, Amazon, Netflix, and Google (obtained using the [`stock_analysis` package](https://github.com/stefmolin/stock-analysis)).\n",
    "\n",
    "## Setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 1\n",
    "We want to look at data for the FAANG stocks (Facebook, Apple, Amazon, Netflix, and Google), but we were given each as a separate CSV file. Make them into a single file and store the dataframe of the FAANG data as `faang`:\n",
    "1. Read each file in.\n",
    "2. Add a column to each dataframe indicating the ticker it is for.\n",
    "3. Append them together into a single dataframe.\n",
    "4. Save the result to a CSV file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "faang = pd.DataFrame()\n",
    "for ticker in ['fb', 'aapl', 'amzn', 'nflx', 'goog']:\n",
    "    df = pd.read_csv(f'../../ch_03/exercises/{ticker}.csv')\n",
    "    # make the ticker the first column\n",
    "    df.insert(0, 'ticker', ticker.upper())\n",
    "    faang = faang.append(df)\n",
    "\n",
    "faang.to_csv('faang.csv', index=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 2\n",
    "With `faang`, use type conversion to change the `date` column to datetime and the `volume` column to integers. Then sort by `date` and `ticker`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "faang = faang.assign(\n",
    "    date=pd.to_datetime(faang.date),\n",
    "    volume=faang.volume.astype(int)\n",
    ").sort_values(\n",
    "    ['date', 'ticker']\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>ticker</th>\n",
       "      <th>date</th>\n",
       "      <th>open</th>\n",
       "      <th>high</th>\n",
       "      <th>low</th>\n",
       "      <th>close</th>\n",
       "      <th>volume</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>AAPL</td>\n",
       "      <td>2018-01-02</td>\n",
       "      <td>166.9271</td>\n",
       "      <td>169.0264</td>\n",
       "      <td>166.0442</td>\n",
       "      <td>168.9872</td>\n",
       "      <td>25555934</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>AMZN</td>\n",
       "      <td>2018-01-02</td>\n",
       "      <td>1172.0000</td>\n",
       "      <td>1190.0000</td>\n",
       "      <td>1170.5100</td>\n",
       "      <td>1189.0100</td>\n",
       "      <td>2694494</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>FB</td>\n",
       "      <td>2018-01-02</td>\n",
       "      <td>177.6800</td>\n",
       "      <td>181.5800</td>\n",
       "      <td>177.5500</td>\n",
       "      <td>181.4200</td>\n",
       "      <td>18151903</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>GOOG</td>\n",
       "      <td>2018-01-02</td>\n",
       "      <td>1048.3400</td>\n",
       "      <td>1066.9400</td>\n",
       "      <td>1045.2300</td>\n",
       "      <td>1065.0000</td>\n",
       "      <td>1237564</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>NFLX</td>\n",
       "      <td>2018-01-02</td>\n",
       "      <td>196.1000</td>\n",
       "      <td>201.6500</td>\n",
       "      <td>195.4200</td>\n",
       "      <td>201.0700</td>\n",
       "      <td>10966889</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  ticker       date       open       high        low      close    volume\n",
       "0   AAPL 2018-01-02   166.9271   169.0264   166.0442   168.9872  25555934\n",
       "0   AMZN 2018-01-02  1172.0000  1190.0000  1170.5100  1189.0100   2694494\n",
       "0     FB 2018-01-02   177.6800   181.5800   177.5500   181.4200  18151903\n",
       "0   GOOG 2018-01-02  1048.3400  1066.9400  1045.2300  1065.0000   1237564\n",
       "0   NFLX 2018-01-02   196.1000   201.6500   195.4200   201.0700  10966889"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "faang.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 3\n",
    "Find the 7 rows with the highest value for `volume`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>ticker</th>\n",
       "      <th>date</th>\n",
       "      <th>open</th>\n",
       "      <th>high</th>\n",
       "      <th>low</th>\n",
       "      <th>close</th>\n",
       "      <th>volume</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>142</th>\n",
       "      <td>FB</td>\n",
       "      <td>2018-07-26</td>\n",
       "      <td>174.8900</td>\n",
       "      <td>180.1300</td>\n",
       "      <td>173.7500</td>\n",
       "      <td>176.2600</td>\n",
       "      <td>169803668</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>53</th>\n",
       "      <td>FB</td>\n",
       "      <td>2018-03-20</td>\n",
       "      <td>167.4700</td>\n",
       "      <td>170.2000</td>\n",
       "      <td>161.9500</td>\n",
       "      <td>168.1500</td>\n",
       "      <td>129851768</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>57</th>\n",
       "      <td>FB</td>\n",
       "      <td>2018-03-26</td>\n",
       "      <td>160.8200</td>\n",
       "      <td>161.1000</td>\n",
       "      <td>149.0200</td>\n",
       "      <td>160.0600</td>\n",
       "      <td>126116634</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>54</th>\n",
       "      <td>FB</td>\n",
       "      <td>2018-03-21</td>\n",
       "      <td>164.8000</td>\n",
       "      <td>173.4000</td>\n",
       "      <td>163.3000</td>\n",
       "      <td>169.3900</td>\n",
       "      <td>106598834</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>182</th>\n",
       "      <td>AAPL</td>\n",
       "      <td>2018-09-21</td>\n",
       "      <td>219.0727</td>\n",
       "      <td>219.6482</td>\n",
       "      <td>215.6097</td>\n",
       "      <td>215.9768</td>\n",
       "      <td>96246748</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>245</th>\n",
       "      <td>AAPL</td>\n",
       "      <td>2018-12-21</td>\n",
       "      <td>156.1901</td>\n",
       "      <td>157.4845</td>\n",
       "      <td>148.9909</td>\n",
       "      <td>150.0862</td>\n",
       "      <td>95744384</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>212</th>\n",
       "      <td>AAPL</td>\n",
       "      <td>2018-11-02</td>\n",
       "      <td>207.9295</td>\n",
       "      <td>211.9978</td>\n",
       "      <td>203.8414</td>\n",
       "      <td>205.8755</td>\n",
       "      <td>91328654</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    ticker       date      open      high       low     close     volume\n",
       "142     FB 2018-07-26  174.8900  180.1300  173.7500  176.2600  169803668\n",
       "53      FB 2018-03-20  167.4700  170.2000  161.9500  168.1500  129851768\n",
       "57      FB 2018-03-26  160.8200  161.1000  149.0200  160.0600  126116634\n",
       "54      FB 2018-03-21  164.8000  173.4000  163.3000  169.3900  106598834\n",
       "182   AAPL 2018-09-21  219.0727  219.6482  215.6097  215.9768   96246748\n",
       "245   AAPL 2018-12-21  156.1901  157.4845  148.9909  150.0862   95744384\n",
       "212   AAPL 2018-11-02  207.9295  211.9978  203.8414  205.8755   91328654"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "faang.nlargest(7, 'volume')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 4\n",
    "Right now, the data is somewhere between long and wide format. Use `melt()` to make it completely long format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>ticker</th>\n",
       "      <th>date</th>\n",
       "      <th>variable</th>\n",
       "      <th>value</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>AAPL</td>\n",
       "      <td>2018-01-02</td>\n",
       "      <td>open</td>\n",
       "      <td>166.9271</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>AMZN</td>\n",
       "      <td>2018-01-02</td>\n",
       "      <td>open</td>\n",
       "      <td>1172.0000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>FB</td>\n",
       "      <td>2018-01-02</td>\n",
       "      <td>open</td>\n",
       "      <td>177.6800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>GOOG</td>\n",
       "      <td>2018-01-02</td>\n",
       "      <td>open</td>\n",
       "      <td>1048.3400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>NFLX</td>\n",
       "      <td>2018-01-02</td>\n",
       "      <td>open</td>\n",
       "      <td>196.1000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  ticker       date variable      value\n",
       "0   AAPL 2018-01-02     open   166.9271\n",
       "1   AMZN 2018-01-02     open  1172.0000\n",
       "2     FB 2018-01-02     open   177.6800\n",
       "3   GOOG 2018-01-02     open  1048.3400\n",
       "4   NFLX 2018-01-02     open   196.1000"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "melted_faang = faang.melt(\n",
    "    id_vars=['ticker', 'date'], \n",
    "    value_vars=['open', 'high', 'low', 'close', 'volume']\n",
    ")\n",
    "melted_faang.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 5\n",
    "Suppose we found out there was a glitch in how the data was recorded on July 26, 2018. How should we handle this?\n",
    "\n",
    "> Given that this is a large data set (~ 1 year), we would be tempted to just drop that date and interpolate. However, some preliminary research on that date for the FAANG stocks reveals that FB took a huge tumble that day. If we had interpolated, we would have missed the magnitude of the drop."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
